1 Introduction and Results

1.1 Motivation and Notation

In this article, we study the coding problem for real-valued Lévy processes $X$ (original) under
$L^p[0,1]$-norm distortion for some fixed $p \in [1,\infty)$. Here we think of $X$ being a
$\mathbb{D}[0,\infty)$-valued process, where $\mathbb{D}[0,\infty)$ denotes the space of càdlàg
functions endowed with the Skorohod topology. We shall denote by $\|\cdot\|$ the standard
$L^p[0,1]$-norm.

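Let $0 < s \le \infty$. The objective is now to find a càdlàg real-valued process $\hat X$
(reconstruction or approximation) that minimizes the error criterion

\[
\big\| \|X - \hat X\| \big\|_{L^s(\mathbb{P})} =
\begin{cases}
\mathbb{E}\big[\|X - \hat X\|^s\big]^{1/s} & \text{if } 0 < s < \infty, \\
\operatorname{ess\,sup} \|X - \hat X\| & \text{if } s = \infty,
\end{cases}
\tag{1}
\]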

under a given complexity constraint on the approximating random variable $\hat X$. We will work
with the following three complexity constraints that have been originally suggested by Kolmogorov
[?]: for $r \ge 0$,

• $\log|\mathrm{range}(\hat X)| \le r$ (quantization constraint)

• $H(\hat X) \le r$, where $H$ denotes the entropy of $\hat X$ (entropy constraint)

• $I(X;\hat X) \le r$, where $I$ denotes the Shannon mutual information of $X$ and $\hat X$
(mutual information constraint).



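We will work with the following standard notation for entropy and mutual information:

\[
H(\hat X) =
\begin{cases}
-\sum_x p_x \log p_x & \text{if } \hat X \text{ is discrete with probability weights } (p_x), \\
\infty & \text{otherwise}
\end{cases}
\]

and

\[
I(X;\hat X) =
\begin{cases}
\int \log \frac{d\mathbb{P}_{X,\hat X}}{d\mathbb{P}_X \otimes \mathbb{P}_{\hat X}} \, d\mathbb{P}_{X,\hat X}
 & \text{if } \mathbb{P}_{X,\hat X} \ll \mathbb{P}_X \otimes \mathbb{P}_{\hat X}, \\
\infty & \text{otherwise.}
\end{cases}
\]

Here, $\mathbb{P}_Z$ denotes the distribution of a random variable $Z$.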

When considering the quantization constraint, we get the following minimal value

\[
D^{(q)}(r,s) := \inf\Big\{ \big\| \|X - \hat X\| \big\|_{L^s(\mathbb{P})} : \log|\mathrm{range}(\hat X)| \le r \Big\},
\]

which we call the (minimal) quantization error for the rate $r \ge 0$ and the moment $s$.
Analogously, we denote by $D^{(e)}(r,s)$ and $D(r,s)$ the minimal values under the entropy and
mutual information constraint, respectively. $D^{(e)}$ and $D$ will be called entropy coding error
and distortion rate function, respectively. We have $D \le D^{(e)} \le D^{(q)}$, for any random
variable.

The quantization constraint naturally appears when coding the signal $X$ under a strict
bit-length constraint. The entropy constraint corresponds to an average bit-length constraint
and the mutual information constraint gains its importance from Shannon's celebrated source
coding theorem. In this article we will not consider the run time behaviour of our coding
schemes. However, we think that the approximation schemes (provided later in the article) have
implementations with reasonable runtime behaviour. Strictly speaking, the quantities $D^{(e)}$ and
$D$ depend on the probability space. However, this dependence has no effect on our results.

The objective of the article is 

• to provide efficient coding strategies for general Lévy processes that are parameterized by
three parameters and that are robust under a mismatch on the Lévy measure and

• to complement the estimates by appropriate lower bounds that show weak optimality of 
our scheme for most cases. 

In the article, $X = (X_t)_{t \in [0,\infty)}$ denotes a Lévy process in the Skorohod space
$\mathbb{D}[0,\infty)$, that is a process starting in $0$ with independent and stationary
increments. Due to the Lévy-Khintchine formula, the characteristic function of each marginal
$X_t$ ($t \in [0,1]$) admits a representation

\[
\mathbb{E} e^{iuX_t} = e^{-t\psi(u)}, \tag{2}
\]

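where

\[
\psi(u) = \frac{\sigma^2}{2} u^2 + ibu + \int_{\mathbb{R}\setminus\{0\}} \big(1 - e^{iux} + \mathbf{1}_{\{|x| \le 1\}}\, iux\big)\, \nu(dx)
\]

for parameters $\sigma^2 \in [0,\infty)$, $b \in \mathbb{R}$, and a positive measure $\nu$ on
$\mathbb{R}\setminus\{0\}$ with

\[
\int 1 \wedge x^2 \, \nu(dx) < \infty. \tag{3}
\]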

On the other hand, for a given triplet $(\nu, \sigma^2, b)$ there exists a Lévy process $X$ such
that (2) is valid; moreover, the distribution of a Lévy process $X$ is uniquely characterized by
the latter triplet. We will call the corresponding process a $(\nu, \sigma^2, b)$-Lévy process.
If (2) is true for

\[
\psi(u) = \frac{\sigma^2}{2} u^2 + \int_{\mathbb{R}\setminus\{0\}} \big(1 - e^{iux} + iux\big)\, \nu(dx),
\]

then we will call $X$ a $(\nu, \sigma^2)$-Lévy martingale. Note that such a representation implies
that $\int |x| \wedge x^2 \, \nu(dx)$ is finite and that the Lévy process $X$ is a martingale in the
usual sense.

After stating our main results in Section 1.2 we shall list some important examples in
Section 1.3. Then Section 2 is devoted to the analysis of a particular coding scheme. The coding
strategy of interest will be a measurable function

\[
\Theta = \Theta_{\varepsilon,b,m} : \mathbb{D}[0,1) \to \mathbb{D}[0,1)
\]

depending on three parameters $\varepsilon > 0$, $b \in \mathbb{R}$ and $m > 0$. The parameter
$\varepsilon$ will be responsible for the quality of the reconstruction, in the sense that smaller
$\varepsilon$ corresponds to smaller approximation errors. The parameters $b$ and $m$ have to be
adjusted to $\varepsilon$ and certain quantities relying on the Lévy measure. Namely, the coding
scheme presented below works in a weakly optimal way (in the sense of both quantization constraint
and entropy constraint coding error) if $m = m(\varepsilon)$ is the mean number of jumps to be
encoded and $b = b(\varepsilon)$ is a drift compensation term. If the generating triplet of the
Lévy process is given, these parameters are explicitly available for computation. If the
generating triplet is not known, these values can be estimated from the data.

In Section 3 we derive lower bounds for the above coding problems. Together, these results
show that the provided coding scheme is weakly optimal in many cases.

Throughout, we use the following notation for strong and weak asymptotics. For two
functions $f$ and $g$, $f(x) \sim g(x)$, as $x \to 0$, means that $f(x)/g(x) \to 1$, as
$x \to 0$. On the other hand, we use the notation $f(x) \lesssim g(x)$, as $x \to 0$, if
$\limsup_{x \to 0} f(x)/g(x) \le 1$. We also write $g(x) \gtrsim f(x)$ in this case. Furthermore,
we write $f(x) \approx g(x)$, as $x \to 0$, if
$0 < \liminf_{x \to 0} f(x)/g(x) \le \limsup_{x \to 0} f(x)/g(x) < \infty$.



1.2 Results 

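The crucial quantities describing the coding complexity of Lévy processes are

\[
F_1(\varepsilon) := \varepsilon^{-2} \Big( \sigma^2 + \int x^2 \wedge \varepsilon^2 \, \nu(dx) \Big)
\quad\text{and}\quad
F_2(\varepsilon) := \int_{[-\varepsilon,\varepsilon]^c} \log\big(|x|/\varepsilon\big)\, \nu(dx).
\]

Furthermore, we shall use $F(\varepsilon) := F_1(\varepsilon) + F_2(\varepsilon)$. The function
integrated by the Lévy measure is visualised in Figure 1. Note that (3) does not ensure the
finiteness of $F_2$ and that $F_2$ is either finite or infinite for all $\varepsilon > 0$.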
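As a small numerical illustration of these two quantities, the following sketch evaluates $F_1$ and $F_2$ for a Lévy measure with finitely many atoms. The atom list, the function names `f1` and `f2`, and the use of the base-2 logarithm are assumptions made for this example only; they are not taken from the article.

```python
import math

def f1(eps, sigma2, atoms):
    """F_1(eps) = eps^-2 * (sigma^2 + sum over atoms of min(x^2, eps^2) * weight)."""
    small_part = sum(min(x * x, eps * eps) * w for x, w in atoms)
    return (sigma2 + small_part) / eps ** 2

def f2(eps, atoms):
    """F_2(eps) = sum over atoms with |x| > eps of log2(|x| / eps) * weight."""
    return sum(math.log2(abs(x) / eps) * w for x, w in atoms if abs(x) > eps)

# Hypothetical Levy measure: mass 1/2 at x = 1 and mass 1/2 at x = -2 (total mass 1).
atoms = [(1.0, 0.5), (-2.0, 0.5)]
F1 = f1(0.5, 0.0, atoms)   # both atoms exceed eps, so F_1 equals the total mass
F2 = f2(0.5, atoms)        # 0.5 * log2(2) + 0.5 * log2(4)
```

For a finite measure without Gaussian part, $F_1(\varepsilon)$ is bounded by the total mass, while $F_2(\varepsilon)$ keeps growing as $\varepsilon$ decreases.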

We are now in a position to state the main results of the article. Let us start with the 
entropy coding error. 
Figure 1: Visualization of the function F



Theorem 1.1. There exist constants $c_1 = c_1(p) > 0$ and $c_2 > 0$ such that, for arbitrary
Lévy processes with finite $F_2$, any $s > 0$, and all $\varepsilon > 0$,

\[
D^{(e)}\big(c_1 F(\varepsilon), s\big) \le c_2\, \varepsilon.
\]

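Similarly to the entropy coding error, we obtain the upper bound for the quantization error.

Theorem 1.2. Assume that there is a $q > s$ such that
(a) $\mathbb{E}\|X\|^q < \infty$,
(b) for some $\mu > 0$,

\[
\limsup_{\varepsilon \downarrow 0} \frac{\int_{|x| > \varepsilon} (|x|/\varepsilon)^\mu \, \nu(dx)}{\nu([-\varepsilon,\varepsilon]^c)} < \infty. \tag{4}
\]

Then there exist a constant $c_1 = c_1(p,\nu) > 0$ and a universal constant $c_2 > 0$ such that,
for all $0 < \varepsilon \le \varepsilon_0 = \varepsilon_0(\nu,s,p)$,

\[
D^{(q)}\big(c_1 F(\varepsilon), s\big) \le c_2\, \varepsilon.
\]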

In the proofs of the upper bounds we only need to consider the case where $F_2$ is finite.
Indeed, in the second theorem, assumption (a) implies the finiteness of $F_2$.

Remark 1.3. Let us comment on the conditions in Theorem 1.2. Condition (a) is natural,
though one could soften it by the use of Orlicz norms. Moreover, condition (b) is needed to
guarantee that typical realizations of the Lévy process dominate the quantization complexity of
the process (see equation (11)). Essentially, (b) does not hold if the Lévy measure is finite or
if $\nu([-\varepsilon,\varepsilon]^c)$ does not grow to infinity fast enough, when $\varepsilon$
tends to zero.
With given Lévy measure, it is usually easy to verify conditions (a) and (b), cf. Remarks 2.1
and 2.2 below.

Remark 1.4. Another approach for the quantization of Lévy processes is taken in [?]. There,
linear quantizers are constructed, and a relation of quantization to the path regularity of
processes is outlined. However, as observed by Creutzig, linear approximations are not optimal
whenever $s > p$. In this article we work with non-linear quantizers, which lead to better
(mostly weakly optimal) results.
The corresponding lower bound reads as follows.

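Theorem 1.5 (Lower bound). There exist universal constants $c_1, c_2, c_3 > 0$ such that the
following holds. For every Lévy process $X$ with finite $F_2$, any $\varepsilon > 0$ with
$F_1(\varepsilon) \ge c_3$ one has

\[
D\big(c_1 F(\varepsilon), 1\big) \ge c_2\, \varepsilon.
\]

Moreover, if $\nu(\mathbb{R}) = \infty$ or $\sigma \ne 0$, one has for any $s > 0$,

\[
D\big(c_1 F_1(\varepsilon), s\big) \gtrsim c_2\, \varepsilon
\]

as $\varepsilon \downarrow 0$. In the case where $F_2 = \infty$, one has $D(r,1) = \infty$ for
any $r \ge 0$.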

Remark 1.6. So far one cannot replace $F_1$ by $F$ in the second statement of Theorem 1.5. Since
mostly $F_1$ and $F$ are weakly equivalent when $\varepsilon$ tends to zero, the second estimate
typically leads to sharp results. Nevertheless, it would be interesting to find out whether one
can close this remaining gap.

Note that we have not specified the basis of the logarithm. However, all results stated above 
are valid for any basis. The choice of the basis has only an influence on the constants in the 
theorems. We will work with the basis 2 when proving the upper bounds, since this seems more 
appropriate in the context of binary representations. When proving the lower bounds we switch 
to the natural logarithm. 

1.3 Examples 

In this subsection, we apply the above results to some common Levy processes. 

Example 1.7 (Stable Lévy process). Let us consider the case of an $\alpha$-stable Lévy process.
Here we have $\nu(dx) = (C_1 \mathbf{1}_{\{x<0\}} + C_2 \mathbf{1}_{\{x>0\}}) |x|^{-\alpha-1}\, dx$,
and one can easily verify that $F_1(\varepsilon) = C \varepsilon^{-\alpha}$ and
$F_2(\varepsilon) = C' \varepsilon^{-\alpha}$ for suitable constants $C, C'$. All assumptions of
the main theorems are satisfied and we conclude that for all moments $s_1 > 0$,
$s_2 \in (0,\alpha)$ and all $p \ge 1$,

\[
D(r, s_1) \approx D^{(e)}(r, s_1) \approx D^{(q)}(r, s_2) \approx r^{-1/\alpha}.
\]

This improves results from [?] and [?].

Note that the coding complexity of an $\alpha$-stable Lévy process is smaller than the one of a
2-stable Lévy process, i.e. Brownian motion. In fact, this is true for all Lévy processes.

Example 1.8 (Lévy process with non-vanishing Gaussian component). It is easy to calculate
that $F_i(\varepsilon) \le c\,\varepsilon^{-2}$ for $i = 1, 2$. Therefore, if $\sigma \ne 0$ then
$F(\varepsilon) \approx F_1(\varepsilon) \approx \varepsilon^{-2}$.

This has two implications. Firstly, in presence of a Gaussian component, the coding complexity
of the Lévy process is the same as for Brownian motion, as long as our results apply. In case
$\sigma = 0$, the coding complexity is weakly bounded from above by that of Brownian motion.

More precisely,

\[
D^{(e)}(r,s) \le C r^{-1/2}, \quad \text{for any Lévy process,}
\]

and

\[
D^{(e)}(r,s) \approx r^{-1/2}, \quad \text{if } \sigma \ne 0.
\]

On the other hand, under the assumptions (a) and (b),

\[
D^{(q)}(r,s) \le C r^{-1/2}, \quad \text{for any Lévy process,}
\]

and

\[
D^{(q)}(r,s) \approx r^{-1/2}, \quad \text{if } \sigma \ne 0.
\]

In fact, by a modification of (11) one can show that (b) is not necessary if $\sigma \ne 0$.

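Example 1.9 (Gamma process). Let us consider a standard Gamma process. In this case,
$\nu(dx) = \mathbf{1}_{\{x>0\}} x^{-1} e^{-x}\, dx$ and one gets
$F_1(\varepsilon) \approx \log 1/\varepsilon$ and $F_2(\varepsilon) \approx (\log 1/\varepsilon)^2$.
Consequently, for fixed $p, s \in [1,\infty)$, there exist constants
$c_1, c_2, c_1', c_2' \in \mathbb{R}_+$ such that for all sufficiently small $\varepsilon > 0$

\[
D^{(e)}\big(c_1 (\log 1/\varepsilon)^2, s\big) \le c_2\, \varepsilon
\]

and

\[
D\big(c_1' (\log 1/\varepsilon)^2, s\big) \ge c_2'\, \varepsilon.
\]

Therefore,

\[
D(r,s) = \exp\big(-e^{O(1)} \sqrt{r}\big) \quad\text{and}\quad D^{(e)}(r,s) = \exp\big(-e^{O(1)} \sqrt{r}\big).
\]

Note that Theorem 1.2 does not apply since condition (4) fails to hold.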

Example 1.10 (Compound Poisson process). Let $(N(t))_{t \ge 0}$ be a standard Poisson process.
Let furthermore $Y, Y_1, Y_2, \dots$ be i.i.d. random variables that are not a.s. equal to $0$ and
independent of the Poisson process. Then

\[
X(t) := \sum_{i=1}^{N(t)} Y_i
\]

is a compound Poisson process, i.e. a Lévy process with Lévy measure $\nu = \mathbb{P}_Y$ and
drift $b = \dots$

It is immediately clear that $F_1(\varepsilon) \le 1$ and
$F_2(\varepsilon) \sim \mathbb{E}\big[\log_+ (|Y|/\varepsilon)\big]$ so that $F_2$ dominates $F$
when $\varepsilon$ is small. Thus the main complexity is induced by the "large jumps". For fixed
$p, s \in [1,\infty)$, the main theorems imply the existence of constants
$c_1, c_2, c_1', c_2' \in \mathbb{R}_+$ such that

\[
D^{(e)}\Big(c_1 \mathbb{E}\Big[\log_+ \frac{|Y|}{\varepsilon}\Big], s\Big) \le c_2\, \varepsilon
\]

and

\[
D\Big(c_1' \mathbb{E}\Big[\log_+ \frac{|Y|}{\varepsilon}\Big], s\Big) \ge c_2'\, \varepsilon.
\]

Hence,

\[
D(r,s) = \exp\big(-e^{O(1)} r\big) \quad\text{and}\quad D^{(e)}(r,s) = \exp\big(-e^{O(1)} r\big).
\]

A more precise result for a subclass of compound Poisson processes was already obtained in the
dissertation of Vormoor [?]. In particular, in those cases, the rates of quantization and entropy
coding error differ.

Note that in the case of a compound Poisson process we cannot use Theorem 1.2 on the
quantization error, since condition (b) is not satisfied.
2 Upper Bounds



2.1 An Explicit Coding Strategy 

In this subsection, we describe an explicit coding strategy that can be used to encode a Lévy
process. We derive that the strategy has a mean error of order $\varepsilon$ and that the bit
complexity is given by the quantity in (10). In the following subsections we use this strategy
in order to prove upper bounds for the entropy coding error and the quantization error.

The reconstruction $\hat X = \Theta_{\varepsilon,b,m}(X)$ will be a step function with the step
heights being integer multiples of $\varepsilon$, i.e. we use an $\varepsilon\mathbb{Z}$ grid to
approximate $X$. For this purpose, let us define $g$ to be a nearest neighbour projection of
$\mathbb{R}$ onto $\varepsilon\mathbb{Z}$. As a first step, we subtract the drift of the process
by setting $X'(t) := X(t) - b(\varepsilon)t$, where $b(\varepsilon)$ is a drift compensation term
given by

\[
b(\varepsilon) := b - \int_{[-1,1]\setminus[-\varepsilon,\varepsilon]} x\, \nu(dx) + \int_{[-\varepsilon,\varepsilon]\setminus[-1,1]} x\, \nu(dx).
\]



Notation. Set $S_0 := 0$ and let

\[
S_i := \inf\big\{ t > S_{i-1} : |X'_t - g(X'_{S_{i-1}})| \ge 2\varepsilon \big\}, \quad i \in \mathbb{N},
\]

be the first exit time of the process $(X'_t - g(X'_{S_{i-1}}))_{t \ge S_{i-1}}$ from the interval
$[-2\varepsilon, 2\varepsilon]$.

Let $M := \max\{i : S_i < 1\}$. Some of the stopping times $S_i$ are induced by jumps larger than
$\varepsilon$. These shall be called large jumps.
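The stopping times and the discretised heights can be sketched on a path that is only observed at finitely many time points. This is an illustration under assumptions: the function names `nearest_grid` and `exit_times`, the sample path, and the restriction to a discrete time grid are hypothetical, and the drift is taken to be already removed.

```python
def nearest_grid(x, eps):
    """A nearest-neighbour projection g of a real number onto the grid eps * Z."""
    return eps * round(x / eps)

def exit_times(times, path, eps):
    """Return the pairs (S_i, H_i) detected in one sweep through the samples.

    `path[k]` is the drift-adjusted value X'(times[k]). A new stopping time is
    recorded whenever the path is at distance >= 2*eps from the current grid
    anchor g(X'_{S_{i-1}}); the height H_i is the difference of grid anchors.
    """
    anchor = nearest_grid(path[0], eps)
    jumps = []
    for t, x in zip(times, path):
        if abs(x - anchor) >= 2 * eps:
            new_anchor = nearest_grid(x, eps)
            jumps.append((t, new_anchor - anchor))
            anchor = new_anchor
    return jumps

# Hypothetical sampled path on [0, 1) with eps = 1.
times = [0.0, 0.25, 0.5, 0.75]
path = [0.0, 0.5, 2.2, 2.3]
jumps = exit_times(times, path, 1.0)
```

Each recorded height is an integer multiple of $\varepsilon$, which is what allows it to be transmitted as an integer.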



Coding procedure. Note that it is possible to detect the jump points $(S_i)_{i=1,\dots,M}$ by a
single sweep through the interval $[0,1]$. For each jump we encode its height and its time
separately by using prefix-free representations: we use a prefix-free representation for the
integers $T_1 : \mathbb{Z} \to \{0,1\}^*$ outlined in Lemma 2.4 to code the number
$H_i/\varepsilon \in \mathbb{Z}$, where $H_i := g(X'_{S_i}) - g(X'_{S_{i-1}})$ denotes the
discretised height. Moreover, the time approximation $\hat S_i$ to $S_i$ is chosen in such a way
that

\[
S_i \le \hat S_i < S_{i+1} \quad\text{and}\quad \hat S_i - S_i \le \varepsilon^p / (|H_i|^p M). \tag{5}
\]

For a visualization, cf. Figure 2. Concretely, we choose $\hat S_i$ as follows. By Lemma 2.5
there is a coding scheme $T_2 : \mathbb{R} \times \mathbb{R}_{>0} \to \{0,1\}^*$, where, for
$r \in [0,1]$, $\delta > 0$, $T_2(r,\delta)$ is the binary representation of a number
$\hat T_2(r,\delta) \in \bigcup_{n \ge 0} 2^{-n}\mathbb{Z} \cap [0,1]$ such that
$\hat T_2(r,\delta) \in [r, r+\delta]$.

We transmit the information in the following way: we divide the interval $[0,1)$ into
$\lceil F_1(\varepsilon) \rceil$ 'boxes' (i.e. intervals)
$I_j = [j/F_1(\varepsilon), (j+1)/F_1(\varepsilon) \wedge 1)$,
$j = 0, \dots, \lceil F_1(\varepsilon) \rceil - 1$. Each jump $S_i$ ($i = 1, \dots, M$) is
translated into the code

\[
\pi_i := \text{'0'} * T_1(H_i/\varepsilon) * T_2\Big(F_1(\varepsilon)S_i - \lfloor F_1(\varepsilon)S_i \rfloor,\; F_1(\varepsilon)(S_{i+1} - S_i) \wedge F_1(\varepsilon)\varepsilon^p/(|H_i|^p M)\Big),
\]

where $*$ denotes the concatenation of strings. Note that
$F_1(\varepsilon)S_i - \lfloor F_1(\varepsilon)S_i \rfloor$ is exactly the difference between the
actual jump point and the left corner of the box, scaled on the unit interval. Then each block
$j$ is described by the string

\[
\Pi_j := \prod_{i :\, S_i \in I_j} \pi_i
\]

and finally the complete information is encoded as

\[
\prod_{j=0}^{\lceil F_1(\varepsilon) \rceil - 1} \big(\Pi_j * \text{'1'}\big).
\]

Figure 2: The coding procedure

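It is easy to check that this provides indeed a prefix-free representation of
$((\hat S_i, H_i)_{i=1,\dots,M})$, and the corresponding approximation defines a deterministic map
$\Theta_{\varepsilon, b(\varepsilon), F_1(\varepsilon)}$ by

\[
\hat X_t := \Theta_{\varepsilon, b(\varepsilon), F_1(\varepsilon)}(X)(t) := b(\varepsilon)t + \sum_{i=1}^M H_i \mathbf{1}_{\{\hat S_i \le t\}},
\]

where

\[
\hat S_i := a_{\mathrm{box},i} + F_1(\varepsilon)^{-1}\, \hat T_2\Big(F_1(\varepsilon)S_i - \lfloor F_1(\varepsilon)S_i \rfloor,\; F_1(\varepsilon)(S_{i+1} - S_i) \wedge F_1(\varepsilon)\varepsilon^p/(|H_i|^p M)\Big)
\]

and $a_{\mathrm{box},i}$ is the left corner of the box that contains $S_i$. Note that, in order to
decode this value, it is sufficient to transmit a code for
$F_1(\varepsilon)S_i - \lfloor F_1(\varepsilon)S_i \rfloor$. The chosen precision ensures (5).
Note that the parameters $\varepsilon$, $b(\varepsilon)$ and $F_1(\varepsilon)$ describe the
approximation scheme uniquely.

For convenience we will also consider the drift adjusted reconstruction $\hat X'$ defined by

\[
\hat X'_t := \sum_{i=1}^M H_i \mathbf{1}_{\{\hat S_i \le t\}}.
\]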

Waiting time for the jumps. Let us estimate the waiting time for subsequent jumps. For
this purpose, let $X^{(1)}$ be the process consisting of the (finitely many) jumps of $X'$ that
are greater than $\varepsilon$ and set $X^{(2)} := X' - X^{(1)}$. Note that $X^{(2)}$ is a
$(\nu|_{[-\varepsilon,\varepsilon]}, \sigma^2)$-Lévy martingale. Denote by $\Gamma_1$ the stopping
time induced by the first jump of $X^{(1)}$. Note that
$|X'_{S_{i-1}} - g(X'_{S_{i-1}})| \le \varepsilon/2$ a.s. so that due to the strong Markov
property one has for all $t \ge 0$,

\[
\begin{aligned}
\mathbb{P}\big(S_i - S_{i-1} \le t \,\big|\, \mathcal{F}_{S_{i-1}}\big)
 &\le \mathbb{P}\Big(\sup_{0 < s \le t} |X^{(2)}_s| \ge \tfrac32\varepsilon\Big) + \mathbb{P}(\Gamma_1 \le t) \\
 &\le (3\varepsilon/2)^{-2}\, \mathbb{E}\Big[\sup_{0 < s \le t} |X^{(2)}_s|^2\Big] + \mathbb{P}(\Gamma_1 \le t),
\end{aligned}
\]

where the last step is justified by Doob's martingale inequality. By the compensation formula
([?], p. 7) the last term equals $F_1(\varepsilon)t$.

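Let $U_1, U_2, \dots$ be a sequence of i.i.d. random variables uniformly distributed on $[0,1]$.
Then we have shown that for all jumps $S_i$,

\[
\mathbb{P}\big(S_i - S_{i-1} \le t \,\big|\, \mathcal{F}_{S_{i-1}}\big) \le \min\big(tF_1(\varepsilon), 1\big) = \mathbb{P}\big(U_i \le F_1(\varepsilon)t\big),
\]

for all $t \ge 0$ and $i \in \mathbb{N}$. Consequently, we can couple the random times
$(S_i - S_{i-1})_{i \ge 1}$ with the sequence $(U_i)_{i \ge 1}$ such that

\[
F_1(\varepsilon)(S_i - S_{i-1}) \ge U_i. \tag{6}
\]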

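Coding error. First, let us analyse the error of the approximation. With the step function
$\bar X'_t := \sum_{i=1}^M H_i \mathbf{1}_{\{S_i \le t\}}$, $t \in [0,1]$, which jumps at the exact
times $S_i$, one gets

\[
\|X - \hat X\| = \|X' - \hat X'\| \le \|X' - \bar X'\| + \|\bar X' - \hat X'\|. \tag{7}
\]

Moreover, due to property (5)

\[
\|\bar X' - \hat X'\|^p = \sum_{i=1}^M |H_i|^p (\hat S_i - S_i) \le \varepsilon^p, \tag{8}
\]

so that $\|X - \hat X\| \le 3\varepsilon$.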

Coding complexity. Let us count the number of bits needed in the approximation:

• Each change in a block is indicated by a '1' which gives in total
$\lceil F_1(\varepsilon) \rceil$ bits.

• Each pair $(H_i, \hat S_i)$ is initialized by a '0' which gives in total $M$ bits.

• Coding the numbers $H_1/\varepsilon, \dots, H_M/\varepsilon$ by using an appropriate
representation $T_1$ needs less than

\[
\sum_{i=1}^M 2\Big(2 + \log_+ \frac{|H_i|}{\varepsilon}\Big)
\]

bits by Lemma 2.4.

• Coding the numbers $\hat S_1, \dots, \hat S_M$ needs less than

\[
\sum_{i=1}^M 2\Big(2 + \log_+ \Big[\frac{1}{F_1(\varepsilon)(S_{i+1} - S_i)} \vee \frac{|H_i|^p M}{F_1(\varepsilon)\varepsilon^p}\Big]\Big)
\]

bits (see Lemma 2.5).
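Therefore, the total bit-length is bounded from above by

\[
2\sum_{i=1}^M \Big[\log_+ \frac{|H_i|}{\varepsilon} + \log_+ \Big(\frac{1}{F_1(\varepsilon)(S_{i+1} - S_i)} \vee \frac{|H_i|^p M}{F_1(\varepsilon)\varepsilon^p}\Big)\Big] + 8M + \lceil F_1(\varepsilon) \rceil.
\]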



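By (6) and the inequality $\log_+(x \vee y) \le \log_+ x + \log_+ y$, the latter is less than

\[
2\sum_{i=1}^M \Big[(1+p)\log_+ \frac{|H_i|}{\varepsilon} + \log_+ \frac{1}{U_i}\Big] + 2M \log_+ \frac{M}{F_1(\varepsilon)} + 8M + \lceil F_1(\varepsilon) \rceil.
\]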



Next, recall from (6) that $F_1(\varepsilon)(S_i - S_{i-1}) \ge U_i$ so that

\[
F_1(\varepsilon) \ge \sum_{i=1}^M F_1(\varepsilon)(S_i - S_{i-1}) \ge \sum_{i=1}^M U_i,
\]

and using the convexity of $\log_+(1/\cdot)$ one gets with Jensen's Inequality

\[
\sum_{i=1}^M \log \frac{1}{U_i} = M \sum_{i=1}^M \frac{1}{M} \log \frac{1}{U_i}
 \ge M \log_+ \frac{1}{\frac{1}{M}\sum_{i=1}^M U_i}
 \ge M \log_+ \frac{M}{F_1(\varepsilon)}.
\]



We conclude that

\[
2\sum_{i=1}^M \Big[(1+p)\log_+ \frac{|H_i|}{\varepsilon} + 2\log \frac{1}{U_i}\Big] + 8M + \lceil F_1(\varepsilon) \rceil \tag{9}
\]

is an upper bound for the bit-length. Denoting for any time $t > 0$ the jump at time $t$ by
$\Delta X_t = X_t - X_{t-}$ allows us to estimate
$|H_i| \le |\Delta X_{S_i}| + \tfrac52\varepsilon$ so that basic analysis gives

\[
\log_+ \frac{|H_i|}{\varepsilon} \le 5 + \log_+ \frac{|\Delta X_{S_i}|}{\varepsilon}.
\]

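Consequently, the bit-length is bounded by

\[
\begin{aligned}
& 4\sum_{i=1}^M \log \frac{1}{U_i} + 2(1+p) \sum_{t \in (0,1]} \log_+ \frac{|\Delta X_t|}{\varepsilon} + (18 + 10p)M + \lceil F_1(\varepsilon) \rceil \\
& \quad \le K_1(p) \sum_{i=1}^M \Big[1 + \log \frac{1}{U_i}\Big] + K_2(p) \sum_{t \in (0,1]} \log_+ \frac{|\Delta X_t|}{\varepsilon} + F_1(\varepsilon) + 1,
\end{aligned}
\tag{10}
\]

where $K_1(p)$ and $K_2(p)$ are constants only depending on $p$.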






2.2 Proof of Theorem 1.1

Proof. By (7) and (8) the error (and thus the mean error, for all moments $s > 0$) is less than
$3\varepsilon$.

On the other hand, the coding complexity of the algorithm constructed above is given by
(10). Let us look at what the different terms amount to on average. Note that

\[
\mathbb{E} \sum_{t \in (0,1]} \log_+ \frac{|\Delta X_t|}{\varepsilon} = F_2(\varepsilon),
\]

by the compensation formula ([?], p. 29). Finally, by Lemma 2.3 we have

\[
\mathbb{E} \sum_{i=1}^M \big(1 + \log U_i^{-1}\big) \le c\, F_1(\varepsilon).
\]

This shows that the expected bit length of the whole message is less than $c_1 F(\varepsilon)$,
with some constant $c_1$ depending only on $p$, as required. □



2.3 Proof of Theorem 1.2

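Proof. We use the coding scheme explained above. However, we encode by the zero function in
case that the number of small jumps, $M$, exceeds $C_1 F(\varepsilon)$, where $C_1$ is a constant
to be chosen presently. The same is done if the complexity to encode the jump heights of the
large jumps, namely $\sum_{t \in (0,1]} \log_+ |\Delta X_t|/\varepsilon$, or the complexity to
encode the positions of the jumps, namely $\sum_{i=1}^M (1 + \log U_i^{-1})$, is larger than
$C_2 F(\varepsilon)$, where $C_2$ is a constant to be chosen presently. Let us define $T$ to be
the event that none of the above cases occurs, i.e. the 'typical case'.
Note that, by the exponential compensation formula ([?], p. 8),

\[
\begin{aligned}
\mathbb{P}\Big(\sum_{t \in (0,1]} \log_+ \frac{|\Delta X_t|}{\varepsilon} > C_2 F(\varepsilon)\Big)
 &\le e^{-C_2 \mu F(\varepsilon)}\, \mathbb{E}\, e^{\mu \sum_{t \in (0,1]} \log_+ \frac{|\Delta X_t|}{\varepsilon}} \\
 &\le e^{-C_2 \mu F(\varepsilon)}\, e^{-\int_{|x| > \varepsilon} (1 - (|x|/\varepsilon)^\mu)\, \nu(dx)}
 \le e^{-C_2 \mu F(\varepsilon)}\, e^{E F(\varepsilon)} \le e^{-(C_2/2) \mu F(\varepsilon)},
\end{aligned}
\tag{11}
\]

where $E$ is some constant depending on the finite constant in (4) only. The last step holds for
$C_2$ large enough. On the other hand, by the Chebyshev Inequality,

\[
\mathbb{P}\Big(\sum_{i=1}^M \big(1 + \log U_i^{-1}\big) > C_2 F(\varepsilon)\Big) \le e^{-(C_2/2) F(\varepsilon)},
\]

for $C_2$ large enough. Finally, one proves, e.g. using the same discretization as in (13), that
for $C_1$ large enough,

\[
\mathbb{P}\big(M > C_1 F(\varepsilon)\big) \le e^{-(C_1/2) F(\varepsilon)}.
\]

Therefore, for some positive constant $C$ depending on $\mu$ and $E$, we have
$\mathbb{P}(T^c) \le \exp(-C F(\varepsilon))$.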
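Let $r > 0$ be chosen by $1/q + 1/r = 1/s$. Let $k > 0$ be chosen small enough such that
$C\nu([-k,k]^c) \ge r$. This is possible, since $\nu([-k,k]^c)$ tends to infinity when $k \to 0$,
by condition (b). Then, for $\varepsilon < k$,

\[
F(\varepsilon) \ge F_2(\varepsilon) = \int \log_+ \frac{|x|}{\varepsilon}\, \nu(dx) \ge \nu([-k,k]^c) \log \frac{k}{\varepsilon} \ge \frac{r}{C} \log \frac{k}{\varepsilon}.
\]

Thus,

\[
\mathbb{P}(T^c) \le \exp\big(-C F(\varepsilon)\big) \le (\varepsilon/k)^r. \tag{12}
\]

Note that the bit complexity of our algorithm is constant if $T^c$ occurs and, by (10), less than
$C F(\varepsilon)$ if $T$ occurs, where $C$ depends on $\mu$ and $E$. Then we have for the mean
error, using the Hölder Inequality and $s \ge 1$,

\[
\begin{aligned}
\mathbb{E}\big[\|X - \hat X\|^s\big]^{1/s}
 &\le \mathbb{E}\big[\mathbf{1}_T \|X - \hat X\|^s\big]^{1/s} + \mathbb{E}\big[\mathbf{1}_{T^c} \|X - \hat X\|^s\big]^{1/s} \\
 &\le c_2 \varepsilon + \big(\mathbb{E} \mathbf{1}_{T^c}\big)^{1/r} \big(\mathbb{E}\|X\|^q\big)^{1/q}
 = c_2 \varepsilon \Big[1 + \varepsilon^{-1} c_2^{-1}\, \mathbb{P}(T^c)^{1/r} \big(\mathbb{E}\|X\|^q\big)^{1/q}\Big],
\end{aligned}
\]

where the term in brackets is bounded, by assumption (a) and (12). Note that the argument
works analogously for $0 < s < 1$. □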



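Remark 2.1. It is easy to see that condition (a) is equivalent to the condition

\[
\int_{|x| > 1} |x|^q \, \nu(dx) < \infty.
\]

Remark 2.2. Let us assume that (a) holds. A sufficient condition for (b) to hold is that
$\nu([-2\varepsilon, 2\varepsilon]^c) \le c \cdot \nu([-\varepsilon, \varepsilon]^c)$ for some
$0 < c < 1$ and all $0 < \varepsilon \le \varepsilon_0$. This can be seen as follows:

\[
\int_{\varepsilon < |x| \le \varepsilon_0} \Big(\frac{|x|}{\varepsilon}\Big)^\mu \nu(dx)
 \le \sum_{k=0}^{\log(\varepsilon_0/\varepsilon)} \int_{2^k\varepsilon < |x| \le 2^{k+1}\varepsilon} \Big(\frac{|x|}{\varepsilon}\Big)^\mu \nu(dx)
 \le \sum_{k=0}^{\infty} 2^{(k+1)\mu} c^k\, \nu([-\varepsilon,\varepsilon]^c).
\]

Choosing $0 < \mu < (-\log c) \wedge q$ yields

\[
\int_{|x| > \varepsilon} \Big(\frac{|x|}{\varepsilon}\Big)^\mu \nu(dx)
 \le K(\mu,c)\, \nu([-\varepsilon,\varepsilon]^c) + \varepsilon^{-\mu} \int_{\varepsilon_0 < |x| \le 1} |x|^\mu \, \nu(dx) + \varepsilon^{-\mu} \int_{|x| > 1} |x|^q \, \nu(dx),
\]

which implies (4).

Note that, in particular, this is the case if $\varepsilon \mapsto \nu([-\varepsilon,\varepsilon]^c)$
is regularly varying at zero with negative exponent.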
2.4 Technical tools

In this section, we prove some technical tools that are needed in the proofs of the main results. 

Lemma 2.3. Let $A > 0$ and let $(U_i)_{i \ge 1}$ be an i.i.d. sequence of random variables
uniformly distributed in $[0,1]$. For $N := \min\{n \in \mathbb{N} : \sum_{i=1}^n U_i \ge A\}$ one
has

\[
\mathbb{E} \sum_{i=1}^N \big(1 + \log U_i^{-1}\big) \le 6 \lceil 2A \rceil.
\]

i=l 



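Proof. For $s \in \mathbb{R}$ define $N(s) := \min\{n \in \mathbb{N}_0 : \sum_{i=1}^n U_i \ge s\}$
and consider the function

\[
\Psi(s) := \mathbb{E} \sum_{i=1}^{N(s)} \big(1 + \log U_i^{-1}\big).
\]

We are interested in $\Psi(A)$. Clearly, $\Psi(s) = 0$ for $s \le 0$ and $\Psi$ is increasing.
Moreover, one has for $s > 0$,

\[
\begin{aligned}
\Psi(s) &= \int \Big(1 + \log x^{-1} + \mathbb{E} \sum_{i=1}^{N(s-x)} \big(1 + \log U_i^{-1}\big)\Big)\, d\mathbb{P}_{U_1}(x) \\
 &= 1 + \int_0^1 -\log x \, dx + \int \mathbb{E} \sum_{i=1}^{N(s-x)} \big(1 + \log U_i^{-1}\big)\, d\mathbb{P}_{U_1}(x) \\
 &= 1 + \log e + \int \Psi(s-x)\, d\mathbb{P}_{U_1}(x) \le 3 + \int \Psi(s-x)\, d\mathbb{P}_{U_1}(x).
\end{aligned}
\]

Let us define

\[
U_1' := \begin{cases} 0 & U_1 \le 1/2, \\ 1/2 & U_1 > 1/2. \end{cases} \tag{13}
\]

Then $U_1' \le U_1$; and since $\Psi$ is increasing, we have

\[
\Psi(s) \le 3 + \int \Psi(s-x)\, d\mathbb{P}_{U_1'}(x) = 3 + \tfrac12 \Psi(s) + \tfrac12 \Psi\big(s - \tfrac12\big).
\]

Therefore, $\Psi(s) \le 6 + \Psi(s - \tfrac12)$ and we get that

\[
\Psi(A) \le 6 + \Psi\big(A - \tfrac12\big) \le 6 + 6 + \Psi(A - 1) \le \dots \le 6 \lceil 2A \rceil.
\]

□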
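The bound of Lemma 2.3 can be checked numerically with a small Monte Carlo experiment. The sketch below is only an illustration: the function names, the number of runs, the seed, and the choice of the base-2 logarithm are assumptions of this example.

```python
import math
import random

def cost_until_sum(A, rng):
    """Sum of (1 + log2(1/U_i)) over i = 1..N, where N is the first n with U_1+...+U_n >= A."""
    total, cost = 0.0, 0.0
    while total < A:
        u = 1.0 - rng.random()          # uniform on (0, 1], avoids log of zero
        total += u
        cost += 1.0 + math.log2(1.0 / u)
    return cost

def mean_cost(A, runs, seed=0):
    """Empirical mean of the cost over independent runs with a fixed seed."""
    rng = random.Random(seed)
    return sum(cost_until_sum(A, rng) for _ in range(runs)) / runs

A = 3.0
estimate = mean_cost(A, runs=2000)
bound = 6 * math.ceil(2 * A)            # the bound 6 * ceil(2A) from the lemma
```

In such experiments the empirical mean stays well below the bound, which is consistent with the constant 6 being chosen for convenience rather than sharpness.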



Let us finally gather two facts concerning the coding of integers and real numbers from a 
given interval, respectively. 

Lemma 2.4. There is a universal coding scheme that returns a prefix free code
$T_1(x) \in \{0,1\}^*$ for a given integer $x \in \mathbb{Z}$ that has a length of at most
$2(2 + \log_+ |x|)$ bits.

Proof. The sign is encoded by a first bit. Thus, assume $x > 0$, because $x = 0$ can be encoded
by '00'. Let $n := \min\{l \in \mathbb{N} \mid x < 2^l\}$. Then $2^{n-1} \le x < 2^n$. Consider the
representation of $x$ in the binary system. Because of the definition of $n$, this representation
must have $n$ bits, the first one of which is a '1'.

A prefix free code for $x$ is given by $n$ times '1', followed by a '0' and the $n-1$ bit long
representation of $x$ in the binary system having taken away the redundant leading '1'.

The length of the code, including the sign bit, is $2n + 1$, which is less than
$2(2 + \log x)$ since $n \le 1 + \log x$. □
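The construction in the proof can be written out directly. The following is a minimal sketch of such a code and its decoder; the function names `t1_encode` and `t1_decode` and the convention "sign bit '0' for positive, '1' for negative" are assumptions of this example.

```python
import math

def t1_encode(x):
    """Prefix-free code: sign bit, n ones, a zero, then the n-1 low bits of |x|."""
    if x == 0:
        return "00"
    sign = "0" if x > 0 else "1"
    n = abs(x).bit_length()              # 2^(n-1) <= |x| < 2^n
    return sign + "1" * n + "0" + bin(abs(x))[3:]   # [3:] drops '0b' and the leading '1'

def t1_decode(bits):
    """Decode one integer from the front of `bits`; return (value, remaining bits)."""
    if bits.startswith("00"):
        return 0, bits[2:]
    sign = -1 if bits[0] == "1" else 1
    pos, n = 1, 0
    while bits[pos] == "1":              # unary length prefix
        n += 1
        pos += 1
    pos += 1                             # skip the terminating '0'
    value = int("1" + bits[pos:pos + n - 1], 2)
    return sign * value, bits[pos + n - 1:]

code_five = t1_encode(5)
stream = t1_encode(-3) + t1_encode(0) + t1_encode(12)
first, rest = t1_decode(stream)
second, rest = t1_decode(rest)
third, rest = t1_decode(rest)
```

Because no codeword is a prefix of another, several integers can be concatenated and decoded one after the other without separators, as the decoding of `stream` illustrates.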

Let us remark that Lemma 2.4 can be improved up to the order $\log x + C \log\log x + D$, as
shown in [?].

Lemma 2.5. There exists a universal coding strategy
$T_2 : \mathbb{R} \times \mathbb{R}_{>0} \to \{0,1\}^*$ such that, for any $\delta > 0$ and
$r \in [0,1]$, $T_2$ returns the prefix free binary representation $T_2(r,\delta)$ of a number
$\hat T_2(r,\delta) \in [r, r+1]$ with $r \le \hat T_2(r,\delta) \le r + \delta$ that needs at most
$2(2 - \log\delta)$ bits.

Proof. Let $N := \min\{n \in \mathbb{N} : \delta \ge 2^{-n}\}$. We choose
$\hat T_2(r,\delta) \in [r, r+1] \cap S_N$ nearest possible, but larger than $r$, where

\[
S_N := \bigcup_{n=0}^{N} 2^{-n}\mathbb{Z}.
\]

This ensures that $0 \le \hat T_2(r,\delta) - r \le 2^{-N} \le \delta$, as required.

Any number $\hat r \in [0,1] \cap S_N$ has a unique representation $\hat r = k2^{-n}$, with $k$
odd, $1 \le k \le 2^n - 1$, $1 \le n \le N$. As a prefix free code $T_2(r,\delta)$ for
$\hat T_2(r,\delta)$ we choose the prefix free code for the integer $2^{n-1} + (k+1)/2$. Since
$\hat T_2(r,\delta) \in S_N$, we have to encode integers from $2$ up to at most $2^N$, which, by
Lemma 2.4, requires at most $2(1 + N)$ bits, which is less than $2(2 - \log\delta)$ bits, by the
definition of $N$. □
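The choice of the approximating dyadic number can be sketched as follows. This illustration only computes the number, not its bit string; the function name `dyadic_above`, the use of exact rational arithmetic, and the convention that the resolution index may be zero are assumptions of this example.

```python
import math
from fractions import Fraction

def dyadic_above(r, delta):
    """Smallest multiple of 2^-N that is >= r, where N is minimal with 2^-N <= delta.

    Returns the pair (approximation as an exact Fraction, N).
    """
    N = max(0, math.ceil(-math.log2(delta)))
    k = math.ceil(Fraction(r) * 2 ** N)   # exact arithmetic avoids rounding issues
    return Fraction(k, 2 ** N), N

approx, level = dyadic_above(0.3, 0.1)    # grid 2^-4 = 0.0625, first grid point above 0.3
```

The approximation never undershoots the target and overshoots by at most the grid width, which is what the time approximation in (5) relies on.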



3 Lower bound 

The aim of this section is to provide lower bounds for the distortion rate function of the Lévy
process. The analysis is divided into three subsections. First we introduce some concepts of
information theory and we prove some preliminary results. Next, we provide a lower bound
based on $F_2$. In the last subsection we give a lower bound in terms of $F_1$. Both lower bounds
then immediately imply Theorem 1.5.

So far $p$ is a fixed value in $[1,\infty)$. Since the distortion rate function is increasing in
the parameter $p$, we can and will fix $p = 1$ in the following discussion.

As mentioned before, we can freely choose the basis of the logarithm in the proof of the main 
theorems. For the rest of this article, we fix as basis e. 

3.1 Preliminaries 

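First we will introduce some concepts of information theory. We will need the concept of
conditional mutual information. Let $A$, $B$ and $C$ denote random vectors attaining values in
some Borel spaces. Then one defines the mutual information between $A$ and $B$ given $C$ as

\[
I(A;B|C) = \int I(A;B|C=c)\, d\mathbb{P}_C(c),
\]

where

\[
I(A;B|C=c) =
\begin{cases}
\int \log \frac{d\mathbb{P}_{A,B|C=c}}{d\mathbb{P}_{A|C=c} \otimes \mathbb{P}_{B|C=c}}\, d\mathbb{P}_{A,B|C=c}
 & \text{if } \mathbb{P}_{A,B|C=c} \ll \mathbb{P}_{A|C=c} \otimes \mathbb{P}_{B|C=c}, \\
\infty & \text{otherwise.}
\end{cases}
\]

A summary of computation rules for the mutual information can be found in [?].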

Lemma 3.1. For $n \in \mathbb{N}$, let $Y_0, \dots, Y_{n-1}$ and $\hat Y_0, \dots, \hat Y_{n-1}$ and
$H$ denote random variables in possibly different Borel spaces. We write shortly
$Y = (Y_0, \dots, Y_{n-1})$, $Y^i = (Y_0, \dots, Y_i)$ for $0 \le i \le n-1$ and
$\hat Y = (\hat Y_0, \dots, \hat Y_{n-1})$. Then one has

\[
I(Y, H; \hat Y) \ge I(Y_0; \hat Y_0 | H) + I(Y_1; \hat Y_1 | H, Y^0) + \dots + I(Y_{n-1}; \hat Y_{n-1} | H, Y^{n-2}).
\]

Moreover, we will need to evaluate the distortion rate function for other originals than the
Lévy process $X$ and for other distortions than $L^p[0,1]$-norm. For a measure $\mu$ on a Borel
space $E$ and a measurable function $\rho : E \times E \to [0,\infty]$ (distortion measure) we
write

\[
D(r|\mu,\rho) = \inf\big\{ \mathbb{E}[\rho(X,\hat X)] : \hat X \ E\text{-valued r.v. with } I(X;\hat X) \le r \big\}.
\]

Moreover, we associate to a map $\rho : E \to [0,\infty]$ the difference distortion measure
$\rho : E \times E \to [0,\infty]$ (denoted by the same identifier) given as
$\rho(x,\hat x) = \rho(x - \hat x)$. Sometimes we will also consider a general moment $s > 0$ and
write

\[
D(r|\mu,\rho,s) = \inf\big\{ \mathbb{E}[\rho(X,\hat X)^s]^{1/s} : \hat X \ E\text{-valued r.v. with } I(X;\hat X) \le r \big\}.
\]

Moreover, we will omit $\rho$ if it is the norm based distortion induced by the $L^p[0,1]$-norm.

The following proposition allows us to separately consider the influence of the large jumps 
and the diffusive part with small jumps onto the coding complexity of the Levy process: 

Proposition 3.2. Let $E$ be a Borel space and assume that $(E,+)$ is an Abelian group such that
the sum is Borel-measurable. Denote by $A$ and $B$ independent $E$-valued random elements and
suppose that there exists a measurable map $\varphi : E \to E^2$ with

\[
\varphi(A + B) = (A, B) \quad \text{a.s.} \tag{14}
\]

Then, under any difference distortion measure $\rho$ on $E$, one has for every $r \ge 0$:

\[
D(r|\mathbb{P}_{A+B}, \rho) \ge D(r|\mathbb{P}_A, \rho).
\]

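Proof. Fix $r \ge 0$. Next, we use that the distortion rate function $D(\cdot|\mathbb{P}_A,\rho)$
is convex. We denote by $f$ a tangent of $D(\cdot|\mathbb{P}_A,\rho)$ at the point $r$. Then, for
any random element $Z$ on $E$,

\[
\begin{aligned}
\mathbb{E}[\rho(A + B, Z)] &= \int \mathbb{E}[\rho(A, Z - b) | B = b]\, d\mathbb{P}_B(b) \\
 &\ge \int f\big(I(A;Z|B=b)\big)\, d\mathbb{P}_B(b) = f\Big(\int I(A;Z|B=b)\, d\mathbb{P}_B(b)\Big) \\
 &= f\big(I(A;Z|B)\big).
\end{aligned}
\]

Therefore,

\[
\inf_{\{Z :\, I(A;Z|B) \le r\}} \mathbb{E}[\rho(A + B, Z)] \ge f(r) = D(r|\mathbb{P}_A, \rho).
\]

On the other hand, by assumption (14), $I(A + B; Z) = I((A,B); Z)$ for any random element $Z$
on $E$. Hence,

\[
I(A + B; Z) = I((A,B); Z) = I(B;Z) + I(A;Z|B) \ge I(A;Z|B).
\]

Therefore,

\[
\begin{aligned}
D(r|\mathbb{P}_{A+B}, \rho) &= \inf_{\{Z :\, I(A+B;Z) \le r\}} \mathbb{E}[\rho(A + B, Z)] \\
 &\ge \inf_{\{Z \text{ r.v. on } E :\, I(A;Z|B) \le r\}} \mathbb{E}[\rho(A + B, Z)] \ge D(r|\mathbb{P}_A, \rho).
\end{aligned}
\]

□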



3.2 Lower bound based on F2 

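Theorem 3.3. There exists some universal constant $c$ such that for all $\varepsilon > 0$,

\[
D\Big(\frac{\kappa(\varepsilon)}{e} F_2(\varepsilon) \,\Big|\, X, L_1[0,1], 1\Big) \ge c\, \kappa(\varepsilon)\, \varepsilon,
\]

where $\kappa(\varepsilon) = \kappa(\varepsilon,\nu) = \lfloor \nu([-\varepsilon,\varepsilon]^c) \rfloor / \nu([-\varepsilon,\varepsilon]^c)$.

The proof of the theorem is based on the following idea: in order to find an approximation
of accuracy $\varepsilon$, one needs to allocate about $\log_+ |X_t - X_{t-}|/\varepsilon$ bits
(nats) for each big jump.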

The problem is related to a minimization problem that we want to introduce now. Let $\Pi$
be a finite non-negative measure on a measurable space $(E, \mathcal{E})$ and let
$h : E \to [0,\infty)$ denote a Borel-measurable function with

\[
\int \log_+ h(x)\, d\Pi(x) < \infty.
\]

The aim is now to minimize for given $r > 0$ the target function

\[
\int h(x) \exp(-\xi(x))\, \Pi(dx)
\]

over all measurable functions $\xi : E \to [0,\infty)$ satisfying the constraint

\[
\int \xi(x)\, d\Pi(x) \le r. \tag{15}
\]



Lemma 3.4. Assuming that $\{h > 0\}$ has not $\Pi$-measure zero, the minimization problem
possesses a $\Pi$-a.e. unique solution of the form

\[
\xi(x) = \log_+ \frac{h(x)}{\lambda}, \tag{16}
\]

where $\lambda = \lambda(r) > 0$ is an appropriate parameter depending on $r > 0$. When the
optimal function $\xi$ is as in (16), then the minimal value of the target function is

\[
\int \lambda \wedge h(x)\, \Pi(dx).
\]
Proof. The proof is based on a Lagrangian analysis. Let $C(y) = \exp(-y)$ ($y \in [0,\infty)$)
and consider its convex conjugate

\[
\bar C(z) = \inf_{y \ge 0} \big[C(y) + yz\big] \quad (z \ge 0).
\]

Let $\lambda > 0$ and denote by $\bar\Pi$ the $\sigma$-finite measure with
$\frac{d\bar\Pi}{d\Pi}(x) = h(x)$. Now observe that for a non-negative function $\xi$ satisfying
the constraint (15) one has

\[
\int h(x) \exp(-\xi(x))\, d\Pi(x) \ge \int \Big[C(\xi(x)) + \lambda \frac{\xi(x)}{h(x)}\Big]\, d\bar\Pi(x) - \lambda r \tag{17}
\]
\[
\ge \int \bar C\Big(\frac{\lambda}{h(x)}\Big)\, d\bar\Pi(x) - \lambda r. \tag{18}
\]

The last expression in this estimate does not depend on the choice of $\xi$. If we can establish
equality in the above estimates for certain $\xi$ and $\lambda$, then this $\xi$ minimizes the
problem.
Next, we note that one has equality in (17) iff

\[
\int \xi(x)\, d\Pi(x) = r \quad\text{and}\quad \xi(x) = 0 \text{ for } \Pi\text{-a.e. } x \text{ with } h(x) = 0. \tag{19}
\]

We need to look for a non-negative function $\xi$ and a parameter $\lambda > 0$ such that (19) is
valid and such that

\[
C(\xi(x)) + \lambda \frac{\xi(x)}{h(x)} = \bar C\Big(\frac{\lambda}{h(x)}\Big) \quad \text{for } \bar\Pi\text{-a.e. } x. \tag{20}
\]

It is straightforward to verify that for positive $z$ the function

\[
[0,\infty) \ni y \mapsto C(y) + zy \in (0,\infty)
\]

attains its unique minimum in $y = \log_+ \frac{1}{z}$. Therefore, condition (20) is equivalent to

\[
\xi(x) = \xi_\lambda(x) := \log_+ \frac{h(x)}{\lambda} \quad \text{for } \bar\Pi\text{-a.e. } x.
\]

Together with (19) a sufficient criterion for $\xi$ being a minimum is the existence of a
$\lambda > 0$ such that

\[
\int \xi(x)\, d\Pi(x) = r \quad\text{and}\quad \xi(x) = \xi_\lambda(x) \text{ for } \Pi\text{-a.e. } x.
\]

Such a $\lambda$ exists since the function

\[
g : (0,\infty) \ni \lambda \mapsto \int \xi_\lambda(x)\, d\Pi(x) \in [0,\infty)
\]

is continuous (due to the dominated convergence theorem) and satisfies

\[
\lim_{\lambda \downarrow 0} g(\lambda) = \infty \quad\text{and}\quad \lim_{\lambda \to \infty} g(\lambda) = 0.
\]

Note that if $\xi$ does not coincide with $\xi_\lambda$ $\Pi$-a.e. (where $\lambda$ is such that
$g(\lambda) = r$), then one of the inequalities (17) or (18) is a strict inequality so that $\xi$
does not minimize the target function. □
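For a measure with finitely many atoms, the parameter $\lambda(r)$ of Lemma 3.4 can be found numerically, since the constraint function is continuous and decreasing in $\lambda$. The sketch below uses bisection; the function names, the example data, and the use of the natural logarithm are assumptions of this illustration.

```python
import math

def log_plus(x):
    """Positive part of the natural logarithm."""
    return math.log(x) if x > 1.0 else 0.0

def rate(lam, h, w):
    """g(lambda) = sum_i w_i * log_+(h_i / lambda) for atoms with values h_i and weights w_i."""
    return sum(wi * log_plus(hi / lam) for hi, wi in zip(h, w))

def solve_lambda(h, w, r, iters=100):
    """Find lambda with g(lambda) = r by bisection; requires r > 0 and some h_i > 0."""
    hi = max(h)                      # g(max h) = 0 <= r
    lo = hi
    while rate(lo, h, w) < r:        # decrease lo until g(lo) >= r
        lo /= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rate(mid, h, w) > r:
            lo = mid
        else:
            hi = mid
    return hi

def minimal_value(lam, h, w):
    """Minimal target value: sum_i w_i * min(lambda, h_i)."""
    return sum(wi * min(lam, hi) for hi, wi in zip(h, w))

h = [4.0, 1.0]                       # hypothetical values of h on two atoms
w = [1.0, 1.0]                       # hypothetical weights of the measure
lam = solve_lambda(h, w, math.log(2.0))
best = minimal_value(lam, h, w)
```

In this example the whole rate is spent on the atom with the larger value of $h$, which mirrors the idea that bits are allocated to the big jumps first.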
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Proof of Theorem 13.31 Fix e > 0. Due to Proposition 13.21 we can assume without loss of 
generality that X is a pure jump process with jumps bigger than e. Next, let / = \/v[[—e^eY)^ 
n = and 

nl f , , , , K(e) ^ , , 

r = — / log — u{dx) = F2(e). 
e e e 

We will prove that for an arbitrarily fixed reconstruction X with I{X; X) ^ r one has 



E[\\X -X\\L,[o,i]]^cnle, 



where c > is a universal constant. 
We let

$$\pi : L_1[0,1] \to \mathbb{R}^n, \quad (x_t) \mapsto \Bigl( \int_{il}^{(i+1)l} \bigl(2 \cdot 1_{\{t \ge (2i+1)l/2\}} - 1\bigr)\, x_t \,\frac{dt}{l} \Bigr)_{i=0,\dots,n-1}$$



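and consider

$$Y = (Y_i)_{i=0,\dots,n-1} = \pi(X) \quad\text{and}\quad \hat Y = \pi(\hat X).$$

The map $\pi$ is $l^{-1}$-Lipschitz continuous so that

$$\mathbb{E}\bigl[\|Y - \hat Y\|_{\ell_1^n}\bigr] \le l^{-1}\, \mathbb{E}\bigl[\|X - \hat X\|\bigr]. \tag{21}$$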

Moreover, $\pi$ is invariant under uniform shifts on each time interval $[il, (i+1)l)$ so that in
particular,

$$\pi(X) = \pi\Bigl(X - \sum_{i=0}^{n-1} X_{(2i+1)l/2}\, 1_{[il,(i+1)l)}\Bigr).$$
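The two properties of $\pi$ used here can be checked on a grid. The sketch below is a hypothetical discretisation (midpoint rule on a uniform grid of $[0,1)$; the name `pi_map` and the example path are ours, not the authors'): it evaluates $\pi$ on a path with a single jump and confirms that adding a constant on each interval $[il,(i+1)l)$ does not change the result.

```python
def pi_map(samples, n):
    """Discretised pi: `samples` are path values at the midpoints of a uniform
    grid on [0, 1); len(samples) must be a multiple of 2 * n."""
    m = len(samples) // n                  # samples per interval [i*l, (i+1)*l)
    assert m * n == len(samples) and m % 2 == 0
    out = []
    for i in range(n):
        block = samples[i * m:(i + 1) * m]
        first, second = block[:m // 2], block[m // 2:]
        # (1/l) * integral of (2*1{t >= midpoint} - 1) * x_t dt, with dt = l/m
        out.append((sum(second) - sum(first)) / m)
    return out

N, n = 4000, 4                             # grid size and number of intervals (l = 1/4)
grid = [(k + 0.5) / N for k in range(N)]

# Path with exactly one jump, of size 2, at time 0.1 (inside the first interval).
jump_path = [2.0 if t >= 0.1 else 0.0 for t in grid]
y = pi_map(jump_path, n)

# Add a different constant on every interval [i*l, (i+1)*l).
shifted = [x + 5.0 * (int(t * n) + 1) for x, t in zip(jump_path, grid)]
y_shifted = pi_map(shifted, n)
```

For a jump of size $h$ at distance $a \le l/2$ from the left end of its interval, the corresponding coordinate equals $ha/l$, here $2 \cdot 0.1 / 0.25 = 0.8$; this lies between $0$ and $h/2$, in line with the uniform distribution used later in the proof.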

Due to the strong Markov property of the Lévy process, the random variables $Y_0, \dots, Y_{n-1}$ are
i.i.d. We shall derive a lower bound for $\mathbb{E}[\|Y - \hat Y\|_{\ell_1^n}]$.
For $i = 0, \dots, n-1$ consider the events

$$A_i = \{X \text{ contains in } [il, (i+1)l) \text{ exactly one jump}\}$$

and the random vector $H = (H_i)_{i=0,\dots,n-1}$ given by

$$H_i = \begin{cases} \text{size of the jump in } [il,(i+1)l) & \text{if } A_i \text{ occurs,} \\ 0 & \text{otherwise.} \end{cases}$$



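Next, denote $Y^i = (Y_0, \dots, Y_i)$ for $i = 0, \dots, n-1$ and $Y^{-1} = 0$. Our objective is to find a lower
bound for

$$\mathbb{E}\bigl[\|Y - \hat Y\|_{\ell_1^n}\bigr] \ge \mathbb{E} \sum_{i=0}^{n-1} \mathbb{E}\bigl[\,|1_{A_i} Y_i - 1_{A_i} \hat Y_i| \,\big|\, H, Y^{i-1}\bigr]. \tag{22}$$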



For each $i \in \{0, \dots, n-1\}$ we analyze the inner expectation. Let
$f_i(h, y^{i-1}) = I(Y_i; \hat Y_i \mid H = h, Y^{i-1} = y^{i-1})$ and consider the random variable

$$R_i = f_i(H, Y^{i-1}).$$
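Given $H$ and $Y^{i-1}$, the r.v. $1_{A_i} Y_i$ is uniformly distributed on the interval with endpoints $0$ and $H_i/2$. Therefore,

$$\mathbb{E}\bigl[\,|1_{A_i} Y_i - 1_{A_i} \hat Y_i| \,\big|\, H, Y^{i-1}\bigr] \ge D\bigl(R_i \mid \mathcal{U}[0,|H_i|/2], |\cdot|\bigr), \tag{23}$$

where $\mathcal{U}[0,u]$ denotes the uniform distribution on $[0,u]$. Now there exists a universal constant
$c > 0$ such that for any $\tilde r \ge 0$ and any $u \ge 0$

$$D\bigl(\tilde r \mid \mathcal{U}[0,u/2], |\cdot|\bigr) \ge c\,u\,e^{-\tilde r}.$$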
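The text does not specify the constant $c$. One admissible value can be read off from the Shannon lower bound: the uniform law on $[0,u/2]$ has differential entropy $\log(u/2)$, and under $\mathbb{E}|Z| \le D$ the entropy is maximised by a Laplace law with entropy $\log(2eD)$, which gives $c = 1/(4e)$. This value is our own computation and an assumption, not necessarily the authors' constant. The sketch below compares this lower bound with the error of a midpoint quantizer with $N$ equal cells (rate at most $\log N$ nats), which bounds the distortion rate function from above.

```python
import math

def shannon_lower_bound(r, u):
    """Lower bound u * exp(-r) / (4e) for U[0, u/2] under |.|-distortion
    (rate r in nats); the constant 1/(4e) is our assumption."""
    return u * math.exp(-r) / (4.0 * math.e)

def uniform_quantizer_error(levels, u):
    """Mean absolute error of the midpoint quantizer that splits [0, u/2]
    into `levels` cells of equal length."""
    cell = (u / 2.0) / levels
    return cell / 4.0

u = 3.0                                   # hypothetical jump size
comparison = [
    (levels,
     shannon_lower_bound(math.log(levels), u),
     uniform_quantizer_error(levels, u))
    for levels in (1, 2, 8, 64)
]
```

The ratio between the two columns is the constant $2/e$ for every $N$, so in this example the exponential decay $u\,e^{-\tilde r}$ is sharp up to a constant factor.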

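Together with (22) and (23) we arrive at

$$\mathbb{E}\bigl[\|Y - \hat Y\|_{\ell_1^n}\bigr] \ge c\, \mathbb{E} \sum_{i=0}^{n-1} |H_i|\, e^{-R_i}.$$

With $\Pi$ defined as the product measure $\mathbb{P} \otimes \sum_{i=0}^{n-1} \delta_i$ this reads

$$\mathbb{E}\bigl[\|Y - \hat Y\|_{\ell_1^n}\bigr] \ge c \int |H_i|\, e^{-R_i}\, d\Pi(\omega, i). \tag{24}$$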
On the other hand, one has $\mathbb{E}[R_i] = I(Y_i; \hat Y_i \mid H, Y^{i-1})$ by definition so that by Lemma 3.1

$$\int R_i \, d\Pi(\omega, i) = \sum_{i=0}^{n-1} \mathbb{E}[R_i] \le I(Y, H; \hat Y) \le I(X; \hat X) \le r.$$

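Now consider the minimization problem for the target function

$$\int |H_i|\, e^{-R_i}\, d\Pi(\omega,i),$$

where the minimum is taken over all random variables $R_i$ ($i = 0, \dots, n-1$) satisfying
$\int R_i \, d\Pi(\omega,i) \le r$. The law of $H_i$ is $(1 - e^{-1})\delta_0 + e^{-1}\, \nu|_{[-\varepsilon,\varepsilon]^c} / \nu([-\varepsilon,\varepsilon]^c)$ so that

$$\int \log_+ \frac{|H_i|}{\varepsilon}\, d\Pi(\omega, i) = \frac{nl}{e} \int_{[-\varepsilon,\varepsilon]^c} \log\frac{|x|}{\varepsilon}\, \nu(dx) = r.$$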

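Hence, Lemma 3.4 implies that the optimal value in the minimization problem is

$$\int \varepsilon\, 1_{\{H_i \ne 0\}}\, d\Pi(\omega, i) = \frac{1}{e}\, n\, \varepsilon.$$

Together with (21) and (24) we get that

$$\mathbb{E}\bigl[\|X - \hat X\|\bigr] \ge \frac{c}{e}\, l\, n\, \varepsilon,$$

which yields the assertion. □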
3.3 Lower bound related to the $F_1$-term

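Theorem 3.5. There exist positive universal constants $c_1$ and $c_2$ such that the following
statements are true. For any $\varepsilon > 0$ with $F_1(\varepsilon) \ge 18$, one has

$$D(c_1 F_1(\varepsilon), 1) \ge c_2\, \varepsilon.$$

If $\nu(\mathbb{R}) = \infty$ or $\sigma \ne 0$, then for any $s > 0$, one has

$$D(c_1 F_1(\varepsilon), s) \gtrsim c_2\, \varepsilon$$

as $\varepsilon \downarrow 0$.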

Let us give some heuristics on the proof of the theorem. As we have mentioned before, the
drift adjusted process $X'$ needs approximately the time $1/F_1(\varepsilon)$ to leave an interval of length $2\varepsilon$.
Assuming that the process is symmetric, the process leaves the stripe to either of the sides with
equal probability (here one also needs to assume that one starts in the center of the interval).
Thus, in order to have a coding of accuracy $\varepsilon$, one needs to describe at least in which direction
the process left the stripe for most of the exits. This requires about $F_1(\varepsilon)$ bits.

As the following remark explains, it suffices to prove the theorem for symmetric Lévy
processes.

Remark 3.6. Let $X^*$ denote an independent copy of $X$ and observe that for $s \in (0, 1]$

$$D(2r \mid \mathbb{P}_{X - X^*}, s) \le 2^{1/s}\, D(r \mid \mathbb{P}_X, s).$$

The process $X - X^*$ is a symmetric Lévy process and the functions describing its complexity
are

$$\tilde F_1(\varepsilon) = 2F_1(\varepsilon) \quad\text{and}\quad \tilde F_2(\varepsilon) = 2F_2(\varepsilon).$$

We assume from now on that the Lévy process $X$ has no drift and a symmetric Lévy
measure $\nu$.

Lemma 3.7. Let $\varepsilon > 0$ and denote

$$T = \inf\{t \ge 0 : |X_t| \ge \varepsilon\}.$$

Then

$$\mathbb{P}(T \ge t) \le \frac{9}{4 F_1(2\varepsilon)\, t}.$$



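Proof. We consider a Lévy process $X^*$ with Lévy measure $\nu^* = \nu \circ \pi^{-1}$ with $\pi : \mathbb{R} \to
[-2\varepsilon, 2\varepsilon]$ being the projection onto the interval $[-2\varepsilon, 2\varepsilon]$. Then the exit times $T$ and

$$T^* = \inf\{t \ge 0 : |X^*_t| \ge \varepsilon\}$$

are equal in law. Moreover, the process $(X^*_{t \wedge T^*})_{t \ge 0}$ is a martingale that is uniformly bounded by $3\varepsilon$, and the
quadratic variation process $[X^*]$ of $X^*$ is a subordinator with Doob-Meyer decomposition

$$[X^*]_t = \bigl([X^*]_t - 4\varepsilon^2 F_1(2\varepsilon)\, t\bigr) + 4\varepsilon^2 F_1(2\varepsilon)\, t.$$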
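Therefore,

$$9\varepsilon^2 \ge \mathbb{E}\bigl[(X^*_{T^*})^2\bigr] = \lim_{t \to \infty} \mathbb{E}\bigl[(X^*_{t \wedge T^*})^2\bigr] = \lim_{t \to \infty} \mathbb{E}[X^*]_{t \wedge T^*} = 4\varepsilon^2 F_1(2\varepsilon) \lim_{t \to \infty} \mathbb{E}(t \wedge T^*) = 4\varepsilon^2 F_1(2\varepsilon)\, \mathbb{E}(T^*).$$

Consequently,

$$\mathbb{E}\, T^* \le \frac{9}{4 F_1(2\varepsilon)}$$

and the assertion follows immediately from Markov's inequality, since $T$ and $T^*$ are equal in law:
$\mathbb{P}(T \ge t) \le \mathbb{E}[T^*]/t$. □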
Lemma 3.8. Let $Y$ be a Bernoulli r.v. Then for $d \in [0, 1/2]$

$$D\bigl(d \log 2d + (1-d) \log 2(1-d) \mid Y, \rho_{\mathrm{Ham}}\bigr) \ge d,$$

where $\rho_{\mathrm{Ham}}$ denotes the Hamming distance.

Proof. Interpret $Y$ as a random variable attaining values in the group $\mathbb{Z}_2$ consisting of two
elements. Then $\rho$ can be interpreted as a difference distortion measure on $\mathbb{Z}_2$, that means for
$x, \hat x \in \mathbb{Z}_2$

$$\rho(x, \hat x) = \rho(x - \hat x) := 1_{\{x - \hat x \ne 0\}}.$$

Next, note that for $d \in [0, 1/2]$:

$$\phi(d) := \sup\{H(Z) : Z \ \mathbb{Z}_2\text{-valued},\ \mathbb{E}[\rho(Z)] \le d\} = -d \log d - (1-d) \log(1-d).$$

We use the concept of the Shannon lower bound to finish the proof: Let $\hat Y$ denote a $\mathbb{Z}_2$-valued
reconstruction with $\mathbb{E}[\rho(Y, \hat Y)] = d \le 1/2$; then

$$I(Y; \hat Y) = H(Y) - H(Y \mid \hat Y) = H(Y) - H(Y - \hat Y \mid \hat Y) \ge H(Y) - H(Y - \hat Y) \ge \log 2 - \phi(d) = d \log 2d + (1-d) \log 2(1-d).$$

□

In the proof we will use that for the Bernoulli distribution $\mu^{\mathrm{Ber}}$ and Hamming distortion
$\rho^{\mathrm{Ham}}$ one has for any $d \in [0, 1/2]$ that

$$D\bigl(d \log 2d + (1-d) \log 2(1-d) \mid \mu^{\mathrm{Ber}}, \rho^{\mathrm{Ham}}\bigr) = d.$$
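The rate appearing in this identity is $\log 2$ minus the binary entropy of $d$. The short Python sketch below (function names are ours; all rates are in nats) checks this relation at a few points, together with the boundary values $\log 2$ at $d = 0$ and $0$ at $d = 1/2$.

```python
import math

def binary_entropy(d):
    """phi(d) = -d log d - (1 - d) log(1 - d), with 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in (d, 1.0 - d) if p > 0.0)

def rate(d):
    """d log 2d + (1 - d) log 2(1 - d), with 0 log 0 = 0."""
    return sum(p * math.log(2.0 * p) for p in (d, 1.0 - d) if p > 0.0)

# Evaluate the rate on a few distortion levels in [0, 1/2].
table = [(d, rate(d), math.log(2.0) - binary_entropy(d))
         for d in (0.0, 0.1, 0.25, 0.5)]
```

The rate decreases from $\log 2$ to $0$ on $[0, 1/2]$, so any rate strictly below $\log 2$ corresponds to a strictly positive distortion; this is the property used at the end of the proof of Theorem 3.5.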

The proof of the lower bound is based on a comparison with a simpler distortion rate function.
For $q \in [0, 1/2]$ let $\mu_q$ denote the measure that assigns probabilities $q$ to $\pm 1$ and $1 - 2q$ to $0$.
Moreover denote by $\mu_q^{\otimes n}$ its product measure, consider the distortion measure

$$\rho(x, \hat x) = 1_{\{x \cdot \hat x = -1\}} \qquad (x \in \{\pm 1, 0\},\ \hat x \in \{\pm 1\})$$

and denote

$$\rho_n(x, \hat x) = \sum_{i=0}^{n-1} \rho(x_i, \hat x_i).$$

As reconstruction we allow any $\{\pm 1\}^n$-valued random vector.
Proposition 3.9. For any $r \ge 0$, $n \in \mathbb{N}$ and any Lévy process with symmetric Lévy measure,
one has

$$D(r \mid \mathbb{P}_X, s) \ge \frac{\varepsilon}{4n}\, D(r \mid \mu_q^{\otimes n}, \rho_n, s),$$

where

$$q = \frac{1}{8}\Bigl(1 - \frac{9n}{F_1(2\varepsilon)}\Bigr).$$

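Proof. First fix $n \in \mathbb{N}$, $r \ge 0$ and a reconstruction $\hat X$ with $I(X; \hat X) \le r$. We denote $l = 1/n$
and consider again

$$\pi : L_1[0,1] \to \mathbb{R}^n, \quad (x_t) \mapsto \Bigl( \int_{il}^{(i+1)l} \bigl(2 \cdot 1_{\{t \ge (2i+1)l/2\}} - 1\bigr)\, x_t \,\frac{dt}{l} \Bigr)_{i=0,\dots,n-1}.$$

The map $\pi$ is $l^{-1}$-Lipschitz continuous and the random vector

$$Y := (Y_i)_{i=0,\dots,n-1} = \pi(X)$$

consists of i.i.d. entries. Additionally, we set $\hat Y = (\hat Y_i)_{i=0,\dots,n-1} = \pi(\hat X)$. Next, consider random
vectors $Z = (Z_i)_{i=0,\dots,n-1}$ and $\hat Z = (\hat Z_i)_{i=0,\dots,n-1}$ defined as

$$Z_i = \begin{cases} \operatorname{sgn}(Y_i) & \text{if } |Y_i| \ge \varepsilon/4, \\ 0 & \text{otherwise} \end{cases} \quad\text{and}\quad \hat Z_i = \begin{cases} 1 & \text{if } \hat Y_i \ge 0, \\ -1 & \text{otherwise.} \end{cases}$$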



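Recalling the Lipschitz continuity of $\pi$ we get that

$$\|X - \hat X\| \ge l\, \|Y - \hat Y\|_{\ell_1^n} \ge \frac{\varepsilon}{4n} \sum_{i=0}^{n-1} \rho(Z_i, \hat Z_i).$$

Therefore,

$$\mathbb{E}\bigl[\|X - \hat X\|^s\bigr]^{1/s} \ge \frac{\varepsilon}{4n}\, \mathbb{E}\bigl[\rho_n(Z, \hat Z)^s\bigr]^{1/s}.$$

Certainly, $Z$ is distributed according to $\mu_q^{\otimes n}$ where $q = \mathbb{P}(Y_i \ge \varepsilon/4)$. Since $I(Z; \hat Z) \le I(X; \hat X) \le r$
we obtain that in general

$$D(r \mid \mathbb{P}_X, s) \ge \frac{\varepsilon}{4n}\, D(r \mid \mu_q^{\otimes n}, \rho_n, s).$$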

Next, we show that $D(r \mid \mu_q^{\otimes n}, \rho_n, s)$ is increasing in $q$. Indeed, let $0 \le q < q' \le 1/2$, let $Z'$
denote a $\mu_{q'}^{\otimes n}$ distributed r.v., and let $\hat Z$ denote a reconstruction for $Z'$ with $I(Z'; \hat Z) \le r$.
Moreover, let $A = (A_0, \dots, A_{n-1})$ be a random vector consisting of i.i.d. Bernoulli random variables
with success probability $q/q'$ that are independent of $Z'$ and $\hat Z$ (for finding such a sequence one
might need to enlarge the probability space), and set $Z := (Z_i)_{i=0,\dots,n-1} := (A_i Z'_i)_{i=0,\dots,n-1}$.
Then $Z$ is $\mu_q^{\otimes n}$-distributed and one has

$$\mathbb{E}[\rho_n(Z, \hat Z)] \le \mathbb{E}[\rho_n(Z', \hat Z)] \quad\text{and}\quad I(Z; \hat Z) \le I(A, Z'; \hat Z) = I(Z'; \hat Z).$$
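The thinning step can be verified exactly for a single coordinate. The sketch below is illustrative only; the values $q = 1/16$ and $q' = 1/4$ are hypothetical. It computes the law of $A_i Z'_i$ with rational arithmetic and checks that multiplying by $A_i \in \{0,1\}$ never increases the single-letter distortion $\rho$.

```python
from fractions import Fraction
from itertools import product

def rho(x, x_hat):
    """Single-letter distortion: 1 if x * x_hat == -1, else 0."""
    return 1 if x * x_hat == -1 else 0

def thinned_law(q, q_prime):
    """Exact law of A * Z' for Z' ~ mu_{q'} and independent A ~ Bernoulli(q/q')."""
    mu = {1: q_prime, -1: q_prime, 0: 1 - 2 * q_prime}
    keep = q / q_prime
    bernoulli = {1: keep, 0: 1 - keep}
    law = {1: Fraction(0), -1: Fraction(0), 0: Fraction(0)}
    for (z, pz), (a, pa) in product(mu.items(), bernoulli.items()):
        law[a * z] += pz * pa
    return law

q, q_prime = Fraction(1, 16), Fraction(1, 4)
law = thinned_law(q, q_prime)

# Pointwise domination: rho(A * z, z_hat) <= rho(z, z_hat) for all outcomes.
dominated = all(
    rho(a * z, z_hat) <= rho(z, z_hat)
    for z in (-1, 0, 1) for a in (0, 1) for z_hat in (-1, 1)
)
```

The resulting law puts mass $q$ on each of $\pm 1$ and $1 - 2q$ on $0$, that is, it equals $\mu_q$.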

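It remains to prove that $\mathbb{P}(Y_i \ge \varepsilon/4) \ge \frac{1}{8}\bigl(1 - \frac{9}{F_1(2\varepsilon)\, l}\bigr)$. Fix $i \in \{0, \dots, n-1\}$ and let

$$(\bar X_t)_{t \in [-l/2, l/2)} = \bigl(X_{t + (2i+1)l/2} - X_{(2i+1)l/2}\bigr)_{t \in [-l/2, l/2)}.$$

The processes $(\bar X_t)_{t \in [0, l/2)}$ and $(-\bar X_{-t})_{t \in [0, l/2]}$ are independent Lévy martingales with Lévy
measure $\nu$. Denote $T_+ = \inf\{t \ge 0 : \bar X_t \ge \varepsilon \text{ or } t \ge l/2\}$ and observe that

$$\mathbb{P}\Bigl(Y_i \ge \frac{\varepsilon}{4}\Bigr) \ge \mathbb{P}\Bigl(\int_0^{l/2} \bar X_{-t}\, dt \le 0,\ T_+ \le l/4,\ \int_{T_+}^{l/2} [\bar X_t - \bar X_{T_+}]\, dt \ge 0\Bigr)$$

$$= \mathbb{P}\Bigl(\int_0^{l/2} \bar X_{-t}\, dt \le 0\Bigr)\, \mathbb{P}(T_+ \le l/4)\, \mathbb{P}\Bigl(\int_{T_+}^{l/2} [\bar X_t - \bar X_{T_+}]\, dt \ge 0 \,\Big|\, T_+ \le l/4\Bigr).$$

Set $T = \inf\{t \ge 0 : |\bar X_t| \ge \varepsilon \text{ or } t \ge l/2\}$. Then the symmetry of $\nu$ together with Lemma 3.7
implies that

$$\mathbb{P}(T_+ \le l/4) \ge \frac{1}{2}\, \mathbb{P}(T \le l/4) \ge \frac{1}{2}\Bigl(1 - \frac{9}{F_1(2\varepsilon)\, l}\Bigr)$$

so that

$$\mathbb{P}\Bigl(Y_i \ge \frac{\varepsilon}{4}\Bigr) \ge \frac{1}{8}\Bigl(1 - \frac{9}{F_1(2\varepsilon)\, l}\Bigr).$$

□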



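Lemma 3.10. Let $\mu^{\mathrm{Ber}}$ and $\rho^{\mathrm{Ham}}$ denote the Bernoulli distribution and the Hamming distance,
respectively. Then

$$D(r \mid \mu_q, \rho) \ge 2q\, D\Bigl(\frac{r}{2q} \,\Big|\, \mu^{\mathrm{Ber}}, \rho^{\mathrm{Ham}}\Bigr).$$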



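Proof. Let $X$ denote a $\mu_q$ distributed r.v. and let $\hat X$ denote a $\{\pm 1\}$-valued reconstruction
with $I(X; \hat X) \le r$. Denote $f(x) = I(X; \hat X \mid |X| = x)$ for $x \in \{0, 1\}$ and let

$$\tilde r = f(1) \quad\text{and}\quad R = f(|X|).$$

Then one has $\mathbb{E} R = I(X; \hat X \mid |X|) \le I(X; \hat X) \le r$ so that due to the non-negativity of $R$

$$\tilde r \le \frac{r}{\mathbb{P}(|X| = 1)} = \frac{r}{2q}.$$

Next, we write

$$\mathbb{E}\rho(X, \hat X) = \mathbb{E}\bigl[1_{\{|X| \ne 0\}}\, \mathbb{E}[\rho(X, \hat X) \mid |X|]\bigr]$$

and note that conditional on $|X| = 1$, $X$ is a Rademacher random variable so that

$$\mathbb{E}\rho(X, \hat X) \ge \mathbb{P}(|X| = 1)\, D(\tilde r \mid \mu^{\mathrm{Ber}}, \rho^{\mathrm{Ham}}).$$

Together with the above estimate for $\tilde r$ this completes the proof. □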
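To see what the bound of Lemma 3.10 looks like numerically, the following sketch inverts the Bernoulli rate $d \mapsto d \log 2d + (1-d)\log 2(1-d)$ on $[0, 1/2]$ by bisection and evaluates $2q\,D(r/(2q) \mid \mu^{\mathrm{Ber}}, \rho^{\mathrm{Ham}})$. It assumes that the Bernoulli distortion rate function is given by this inverse, as stated before Proposition 3.9; the function names and the sample values of $r$ and $q$ are hypothetical.

```python
import math

LOG2 = math.log(2.0)

def bernoulli_rate(d):
    """d log 2d + (1 - d) log 2(1 - d) in nats, with 0 log 0 = 0."""
    return sum(p * math.log(2.0 * p) for p in (d, 1.0 - d) if p > 0.0)

def bernoulli_distortion(r, iters=100):
    """Inverse of bernoulli_rate on [0, 1/2]; the rate is decreasing there."""
    if r >= LOG2:
        return 0.0
    lo, hi = 0.0, 0.5
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if bernoulli_rate(mid) > r:
            lo = mid          # distortion mid is too small for rate r
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lemma_3_10_bound(r, q):
    """Right-hand side 2q * D(r / (2q) | Bernoulli, Hamming) of the lemma."""
    if q <= 0.0:
        return 0.0
    return 2.0 * q * bernoulli_distortion(r / (2.0 * q))

bound_zero_rate = lemma_3_10_bound(0.0, 0.25)      # no information about X
bound_small_rate = lemma_3_10_bound(0.05, 1.0 / 16.0)
bound_large_rate = lemma_3_10_bound(1.0, 1.0 / 16.0)
```

At rate zero the bound equals $q$, which is exactly the error of a constant reconstruction; it stays positive as long as $r/(2q) < \log 2$ and vanishes beyond that threshold.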

Proof of Theorem 3.5, 1st statement. Let $\varepsilon > 0$ with $F_1(2\varepsilon) \ge 18$ and choose $n \in \mathbb{N}$
maximal with $n \le F_1(2\varepsilon)/18$. Then

$$q := \frac{1}{8}\Bigl(1 - \frac{9n}{F_1(2\varepsilon)}\Bigr) \ge \frac{1}{16}.$$

Additionally, there exists a universal constant $c_3 > 0$ such that $n \ge c_3 F_1(2\varepsilon)$. Next, we shall
apply Proposition 3.9. We fix $r_0 < \log 2$ arbitrarily and set $r = \frac{1}{8} n r_0$. Then $r \ge c_1 F_1(2\varepsilon)$ for
some constant $c_1$ only depending on the choice of $r_0$. Thus with Proposition 3.9 one gets

$$D(c_1 F_1(2\varepsilon), s) \ge D(r, s) \ge \frac{\varepsilon}{4n}\, D\Bigl(\frac{1}{8} n r_0 \,\Big|\, \mu_q^{\otimes n}, \rho_n, s\Bigr). \tag{25}$$

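Recall that statement 1 of the theorem considers the case where $s = 1$. But $D\bigl(\frac{1}{8} n r_0 \mid \mu_q^{\otimes n}, \rho_n, 1\bigr)$
is a distortion rate function for a single letter distortion measure and an i.i.d. original, and,
therefore,

$$D\Bigl(\frac{1}{8} n r_0 \,\Big|\, \mu_q^{\otimes n}, \rho_n, 1\Bigr) = n\, D\Bigl(\frac{1}{8} r_0 \,\Big|\, \mu_q, \rho\Bigr).$$

The latter distortion rate function has been related to that of a Bernoulli variable in Lemma 3.10:

$$D\Bigl(\frac{1}{8} r_0 \,\Big|\, \mu_q, \rho\Bigr) \ge 2q\, D\Bigl(\frac{r_0}{16 q} \,\Big|\, \mu^{\mathrm{Ber}}, \rho^{\mathrm{Ham}}\Bigr).$$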



Since $q \ge 1/16$ the rate in the last distortion rate function is bounded by $r_0 < \log 2$ so that the
distortion rate function yields a value $c_4 > 0$ that is strictly positive. Altogether,

$$D(c_1 F_1(2\varepsilon), 1) \ge \frac{\varepsilon}{2}\, q\, c_4 \ge c_2\, 2\varepsilon,$$

where one may take $c_2 = c_4/64$. Switching from $2\varepsilon$ to $\varepsilon$ finishes the proof of the first assertion.
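The choice of $n$ in this proof can be sanity-checked numerically. The sketch below assumes, as we read the display, that $q = \frac{1}{8}(1 - 9n/F_1(2\varepsilon))$, and it uses $c_3 = 1/36$ as one admissible value of the constant (our choice for illustration; the text only asserts that some $c_3 > 0$ exists). The values of $F_1(2\varepsilon)$ are hypothetical.

```python
import math

def choose_n(F):
    """Largest integer n with n <= F / 18, where F stands for F_1(2 * eps) >= 18."""
    return math.floor(F / 18.0)

def q_of(n, F):
    """q = (1/8) * (1 - 9 n / F), as in the proof (our reading of the display)."""
    return (1.0 - 9.0 * n / F) / 8.0

checks = []
for F in (18.0, 25.0, 100.0, 1234.5):      # hypothetical values of F_1(2 * eps)
    n = choose_n(F)
    checks.append((F, n, q_of(n, F)))
```

Since $n \le F_1(2\varepsilon)/18$ forces $9n/F_1(2\varepsilon) \le 1/2$, the value of $q$ never drops below $1/16$, with equality when $F_1(2\varepsilon)$ is a multiple of $18$; and since $\lfloor x \rfloor \ge x/2$ for $x \ge 1$, one gets $n \ge F_1(2\varepsilon)/36$.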
□

The proof of the second statement relies on the following concentration property:



Lemma 3.11. Let $\rho : \mathbb{R} \times \mathbb{R} \to [0, \infty]$ be a measurable function, let $(U_i)_{i \in \mathbb{N}}$ be a sequence
of independent bounded random variables, and denote by $U^{(n)}$ the random vector $(U_i)_{i=1,\dots,n}$.
Supposing that there exists $u^* \in \mathbb{R}$ such that

$$\mathbb{E}[\rho(U_1, u^*)^2] < \infty \tag{26}$$

one has for any $s > 0$ and $r > 0$:

$$\liminf_{n \to \infty} \frac{1}{n}\, D(nr \mid U^{(n)}, \rho_n, s) \ge d,$$

where $d = D(r \mid U_1, \rho, 1)$ and $\rho_n$ is the single letter distortion measure belonging to $\rho$.

As one can see in the proof the moment condition (26) can be easily relaxed. Similar ideas
are used in to prove concentration of the approximation error. 

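Proof. Without loss of generality we assume that $D(r \mid U_1, \rho) > 0$. Our moment condition
implies that $D(\cdot \mid U_1, \rho)$ is finite, convex and continuous on $[0, \infty)$. Following the standard proof
of Shannon's source coding theorem, there is a family of codebooks $(\mathcal{C}(n))_{n \in \mathbb{N}}$ such that

• $\{(u^*, \dots, u^*)\} \subset \mathcal{C}(n) \subset \mathbb{R}^n$,

• $\log |\mathcal{C}(n)| \le nr$,

• $\lim_{n \to \infty} \mathbb{P}(\mathcal{T}(n)) = 1$ for $\mathcal{T}(n) = \{\min_{\hat u^{(n)} \in \mathcal{C}(n)} \rho_n(U^{(n)}, \hat u^{(n)}) \le (1 + \varepsilon(n))\, d\}$ and an
appropriate zero-sequence $(\varepsilon(n))_{n \in \mathbb{N}}$.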
For any $n \in \mathbb{N}$, let $\hat U^{(n,1)}$ denote an arbitrary reconstruction for $U^{(n)}$ such that we have
$I(U^{(n)}; \hat U^{(n,1)}) \le nr$, and let $\hat U^{(n,2)} = \operatorname{argmin}_{\hat u^{(n)} \in \mathcal{C}(n)} \rho_n(U^{(n)}, \hat u^{(n)})$. We fix $\eta \in (0, 1)$ arbitrarily

and choose



J= < 



1 if log ^ ^ and p„(i7W, CX^'^)) ^ (1 - 77)^, 



[/(") 

2 else, 



and = UM. 

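Next, we will use that

$$I(U^{(n)}; \hat U^{(n)}) \le I(U^{(n)}; \hat U^{(n)}, J) \le \inf_Q H\bigl(\mathbb{P}_{U^{(n)}, \hat U^{(n)}, J} \,\big\|\, \mathbb{P}_{U^{(n)}} \otimes Q\bigr),$$

where the infimum is taken over all probability measures $Q$ on $\mathbb{R}^n \times \{1, 2\}$ and $H$ denotes the
relative entropy. We choose

$$Q = \frac{1}{2}\bigl[\mathbb{P}_{\hat U^{(n,1)}} \otimes \delta_1 + Q^* \otimes \delta_2\bigr] \quad\text{with}\quad Q^* = \frac{1}{|\mathcal{C}(n)|} \sum_{\hat u^{(n)} \in \mathcal{C}(n)} \delta_{\hat u^{(n)}}$$

in order to get an appropriate bound for $I(U^{(n)}; \hat U^{(n)})$: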



/ 



C?IP^(n)^[/(n)^J 



l^S — ' ^ ^ ^r/{") ,[/("), J 



J{j=i} 



Note that the measures P^(„) jj(„) j and P^(n) (j(n,i) j agree on the set {J = 1} so that by 
the construction of J one has log ^ ^ ^(gp'.^^ 1)0^1 ^ ~ Moreover, one has 

dP ( )ig)Q*0'52 ^ l^l*^)! {"^ — 2}- Consequently, we can continue with 



nr. 



/([/(«)^[>W) ^ p(j^ l)nr + P(J = 2)log|C(n)| +log2 < 
On the other hand, basic transformations and the Cauchy-Schwarz inequality yield

$$\mathbb{E}[\rho_n(U^{(n)}, \hat U^{(n)})] = \mathbb{E}[1_{\{J=1\}} \rho_n(U^{(n)}, \hat U^{(n,1)})] + \mathbb{E}[1_{\{J=2\}} \rho_n(U^{(n)}, \hat U^{(n,2)})]$$

$$\le (1 - \eta)\, d\, \mathbb{P}(J = 1) + \mathbb{P}(J = 2)(1 + \varepsilon(n))\, d + \mathbb{P}(\mathcal{T}(n)^c)^{1/2}\, \mathbb{E}[\rho_n(U^{(n)}, (u^*, \dots, u^*))^2]^{1/2}$$

$$\sim [(1 - \eta)\, \mathbb{P}(J = 1) + \mathbb{P}(J = 2)]\, d.$$

Therefore, $\lim_{n \to \infty} \mathbb{P}(J = 1) = 0$. Consequently, we arrive at

$$\mathbb{E}[\rho_n(U^{(n)}, \hat U^{(n,1)})^s]^{1/s} \ge \mathbb{P}(J \ne 1)^{1/s} (1 - \eta)\, d \sim (1 - \eta)\, d$$

and recalling that $\eta \in (0, 1)$ was arbitrary finishes the proof. □

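Proof of Theorem 3.5, 2nd statement. We define $r_0$, $q$ and $n$ as in the proof of the
first statement. By assumption $\nu(\mathbb{R}) = \infty$ or $\sigma \ne 0$. Consequently, one has $\lim_{\varepsilon \downarrow 0} F_1(\varepsilon) = \infty$
and $n$ converges to $\infty$ as $\varepsilon$ tends to $0$.

We recall estimate (25):

$$D(c_1 F_1(2\varepsilon), s) \ge D(r, s) \ge \frac{\varepsilon}{4n}\, D\Bigl(\frac{1}{8} n r_0 \,\Big|\, \mu_q^{\otimes n}, \rho_n, s\Bigr).$$

Now we conclude with Lemma 3.11 that

$$D\Bigl(\frac{1}{8} n r_0 \,\Big|\, \mu_q^{\otimes n}, \rho_n, s\Bigr) \gtrsim n\, D\Bigl(\frac{1}{8} r_0 \,\Big|\, \mu_q, \rho\Bigr).$$

The assertion follows along the lines of the proof of the first statement. □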